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We report a comprehensive approach to analysing continuous-output photon detectors. We em¬ 
ploy principal component analysis to maximise the information extracted, followed by a novel noise- 
tolerant parameterised approach to the tomography of PNRDs. We further propose a measure 
for rigorously quantifying a detector’s photon-number-resolving capability. Our approach applies 
to all detectors with continuous-output signals. We illustrate our methods by applying them to 
experimental data obtained from a transition-edge sensor (TES) detector. 


The continuing development of highly efficient photon 
detectors has significant impact across a broad range of 
fields, from quantum information [1] to astronomy [2] 
and biomedical imaging [3]. The physics underlying the 
operation of different photon detectors is rich and var¬ 
ied, but their outputs typically fall into two categories. 
Those such as photomultiplier tubes, avalanche photo¬ 
diodes [4] and superconducting nanowires [5, 6] are of¬ 
ten based on avalanche phenomena and lead to discrete 
‘click’ outcomes, while others, such as transition-edge 
sensors [7], kinetic-inductance detectors [2] and super¬ 
conducting tunnel junctions [ 8 ] rely on smooth transi¬ 
tions leading to continuous ‘trace’ outputs (avalanche 
photodiodes can also give continuous-valued outputs un¬ 
der appropriate conditions [9]). Some of these, includ¬ 
ing TES detectors, are highly sensitive single-photon de¬ 
tectors with quantum efficiencies of up to 98% [7, 10] 
and true photon-number sensitivity [7]. Others, such as 
microwave kinetic inductance detectors, allow unprece¬ 
dented level of integration into large arrays [2]. These 
advances over traditional discrete-output detectors will 
enable new applications in wide-ranging fields. 

With these novel applications and regimes of perfor¬ 
mance come additional challenges in detector character¬ 
isation. Unlike discrete-output detectors, many photon- 
number resolving detectors (PNRD) produce a complex 
time-varying signal from which the input state must be 
inferred. Efficiently extracting information from these 
signals is therefore necessary to realise the full capability 
of such detectors [11-13]. 

The signal produced by a continuous-output detector 
is typically a time-dependent voltage with some depen¬ 
dence on photon number which may in general be non¬ 
linear, as shown in Fig. la. A set of such output signals 
V = {vi(t)}, arising from a set of input states of the in¬ 
cident light beam, can be represented using a set of basis 
functions {wj(t)}, such that 

n 

Vi(t) = ^2s ij w j {t). 
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In general, this implies that, in order to capture the full 
output of the detector, it is necessary to determine the 


weighting components Sij for all of the n basis functions 
for each signal to be measured. For a truly continuous 
signal, n is in principle infinite, but of course for any 
real experiment the upper limit to n is set by the tem¬ 
poral and voltage resolution of the detector. However, 
this finite signal still spans a space of high dimension; in 
our work a signal consists of 1024 16-bit numbers. Di¬ 
rectly analysing this signal is therefore impractical. This 
is particularly the case for detector tomography, neces¬ 
sary to rigorously characterise the relationship between 
input states and output signals [14, 15]. Detector to¬ 
mography requires a sufficiently small space of outputs 
that the probability of a given outcome can be estimated 
precisely from the measured data. For the full output 
space of our detector signal, we estimate the probability 
of the same trace occurring twice (to within the resolu¬ 
tion of the analogue-to-digital converter) in a data set 
of 10 5 traces to be on the order of 10~ 4 , rendering to¬ 
mography in this full space infeasible. This motivates 
the development of an approach to the characterization 
of continous-output detectors that enables accurate and 
precise signal analysis and detector tomography. 

Detector tomography has been previously carried out 
for continuous-output PNRDs with 5% quantum efficien¬ 
cies [12], in which the continuous-output problem was cir¬ 
cumvented by ‘binning’ the detector output based on the 
maximum amplitude of the signal. This approach does 
not make optimal use of the information available. Fur¬ 
thermore, as we will discuss, the numerical techniques 
for detector tomography used in the study are not ef¬ 
fective in the high detection-efficiency regime, which is 
now accessible with TES detectors. Another recent work 
has explored algorithmic methods of interpreting the re¬ 
sponse of high detection-efficiency PNRDs based on clus¬ 
ter analysis [13]. Although this may prove useful for rapid 
characterisation of a detector, it is not a tomographic 
technique and is therefore unable to provide a rigorous 
characterisation of the detector response. 

Principal component analysis: We first consider the 
problem of efficiently extracting information from a high¬ 
dimensional detector signal data set. We achieve this 
by employing a standard technique from multi-variate 
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FIG. 1. a) Representative TES traces Vi(t) from a data set of 
180,000 total signals, b) Truncated representation of the same 
traces using only the first two principal components wi (t) 
and W 2 (t). c) Variance of the set of principal component 
scores {sij} as a function of the principal component number 
j. d) Principal components Wi (t) (blue) andu) 2 (t) (green), e) 
Probability density function p(sii|a) over the signal scores s;i 
for a coherent state input |a)(a| with a mean of 3.1 photons 
per pulse. Note that these values can be negative as the mean 
signal is subtracted from each signal during the calculation of 
s;i 


statistics, namely principal component analysis [16]. For 
a given data set, this approach determines the optimal 
set of ‘principal component’ basis functions {wj(t)} such 
that each successive basis function captures the maxi¬ 
mum amount of information possible from the data set 
(as measured by the variance of the ‘principal compo¬ 
nent scores’ Sij), while maintaining orthogonality with 
the previous components. Crucially, this implies that if 
the principal component basis is truncated to compress 
the data, the maximum amount of the variance of the 
original data set will still be captured. In other words, 
the truncated principal component basis will provide the 
most faithful reconstruction of the data for a given num¬ 
ber of components. 

In an actual experiment, the signals v t (t) and therefore 
the basis functions Wj(t) are necessarily discretised due 
to the finite temporal resolution of the detector. In this 
case the set of signals V can be expressed as a matrix. 
It can be shown that the problem of determining {wj(t)} 
for V is equivalent to finding the eigenvectors of the ma¬ 
trix V T V , where V is the data set with the mean signal 
subtracted [16]. These eigenvectors can be efficiently de¬ 


termined using singular value decomposition. Once Wi(t) 
are known, can be calculated from the detector signals 
vj(t) by = J Vj(t)wi(t)dt. 

We applied principal component analysis to a data set 
of 180,000 TES traces, taken with a range of 300 differ¬ 
ent coherent state inputs with average photon numbers 
spanning from 0 to approximately 15 photons per pulse. 
In Fig. la & b, example TES traces from this data set are 
plotted both in their original form, and in a reduced form 
using only the first two principal components wq(t) and 
u >2 (t). As can be seen, with just these two components, 
most of the structure of the traces has been reproduced. 
This can be shown more formally by comparing the vari¬ 
ance of {Sij} for different principal component numbers j, 
as plotted in Fig. lc. The variance of {s^i} is two orders 
of magnitude greater than {s^}, and this trend contin¬ 
ues, with the variance rapidly decreasing as a function of 

j- 

Interestingly, as Fig. Id shows, w\(t) is very close to 
the mean shape of the TES traces. This would be ex¬ 
pected theoretically in the small-signal limit, in which 
the TES trace height simply scales linearly with the pho¬ 
ton number [17]. This confirms that projecting onto the 
mean trace shape, as used by [13], is a useful approach for 
distinguishing TES signals in the few-photon limit using 
only a single parameter. Beyond providing a justification 
for this choice of processing method, the higher order 
principal components that are revealed by our analysis 
can provide additional data with which to characterise 
the response of a detector, particularly for higher photon 
numbers. For example, w 2 (f) captures the increase in the 
pulse length with photon number due to an increase in 
thermal recovery time [18]. However, since the dominant 
contribution to the data variance is from wi(t), partic¬ 
ularly for the low photon numbers considered here, we 
choose to solely focus on this component for the remain¬ 
der of our analysis. 

Detector tomography: We now seek to determine the 
correspondence between the reduced detector signals and 
the input number of photons by carrying out detector 
tomography [14]. The goal of detector tomography is to 
determine the positive-operator-valued measure (POVM) 
{7r(s)} that fully characterises the detector response; this 
is parameterised by the outcome s in the space of Sn. 
Once the POVM is known, the probability density for 
detector outcome s, given input state p, is determined 
by the Born rule 

P(s\p) =Tr[pn(s)}. ( 1 ) 

The standard approach to tomography consists of exper¬ 
imentally estimating the outcome probability densities 
p(s\pk) for a set of known probe basis states {pk}- Using 
these estimated probabilities, equation (1) can then, in 
principle, be inverted to find 7r(s). 

The set of probe states {pk} must provide a sufficient 
basis for the operator space of the POVM {7r(s)}; in other 
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words, it must be tomographically complete. We satisfy 
this constraint by using a well established method [14] for 
tomography of PNRDs based on coherent state probes 
|a). It is well known that coherent states form an over¬ 
complete basis for an optical mode. Coherent states are 
also straightforward to generate in the lab and are insen¬ 
sitive in form to experimental losses during preparation, 
making them ideal probe states. Additionally, as TES de¬ 
tectors are phase insensitive, their response depends only 
on the magnitude of the coherent state parameter a, and 
not its phase. This significantly reduces the number of 
probe states needed to form a tomographically complete 
set of basis operators and removes the need for any phase 
reference in the experiment. 

A phase insensitive detector will have POVM elements 
diagonal in the photon-number basis; these can therefore 
be expressed as 


^(s) = 0 n (s) \n)(n \. (2) 

n =0 

Coherent-state probes are given in this basis by [19] 

00 n 

I a) =exp(—M 2 /2)^-^|n), (3) 

Inserting equations (2) & (3) into the Born rule (equa¬ 
tion (1)), we find that the probability density for a given 
outcome is 


p{s\a) = ^2 F a,n On(s). (4) 

n—0 


where F a<n = \a\ 2n exp( J a| } 

Using the set of probability density functions p(s\otk) 
associated with the input probe states {la*,}}, this rela¬ 
tion can be numerically inverted to find the best solution 
for 0 n (s) consistent with the physicality constraints 

9 n (s) > 0, and J 9 n (s) ds < 1. 

It is necessary to use a calibrated light source in order 
to produce coherent-state probes with known energies 
for detector tomography. Since we do not have access to 
a source calibrated to a radiometric standard, we built 
our own calibrated source by using a Newport 918D-IG- 
OD3R power meter, which provides a specified calibra¬ 
tion accuracy of 2% of absolute power and a linearity of 
better than 0.5%. This power meter was used to calibrate 
a series of fixed attenuators to reduce the output from 
a pulsed laser to the single-photon level with a known 
mean-photon number per pulse [20]. 

We measured the detector response to a set of 300 dif¬ 
ferent probe energies equally spaced between 0 and 15 
photons per pulse. For each probe energy, we ran 49152 
trials, and used the measured signals to estimate the 


probability density function for the outcomes in the space 
of sn [21]. Fig. le shows an example measured probabil¬ 
ity density function for a probe state with a mean of 3.1 
photons per pulse. 

It is well known that the problem of inverting equa¬ 
tion (1) to obtain 7r(s) is ill-conditioned [14]. We found 
that published methods of performing this numerical in¬ 
version based on constrained least squares techniques [12] 
did not give satisfactory results [22]. This may be in part 
due to the reduced overlap between the POVM elements 
for different photon numbers as compared to previous 
studies because of our much higher system detection ef¬ 
ficiency. This means that regularisation techniques de¬ 
signed to promote this overlap [14, 15] do not work as 
effectively. 

We used insights from our collected data to develop 
a novel detector tomography routine that is effective for 
high quantum efficiencies. We adopted a model in which 
the detector response to photon number n (in the space 
of Si) is given by the sum of n+1 Gaussians, with widths, 
heights, and positions as free variables. This Gaussian- 
mixture model [23] is consistent with detectors for which 
several different sources of noise contribute to the re¬ 
sponse of the detector to a given photon number, leading 
to an overall Gaussian error as might be expected from 
the central limit theorem. We employed a maximum like¬ 
lihood routine [24] to find the parameterised POVM that 
was most consistent with the full coherent state tomog¬ 
raphy data set. 
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FIG. 2. a) Fock state POVM elements determined from our 
parameterised detector tomography routine. Note that these 
solutions are continuous functions in the space of s;i, and have 
not been arbitrarily binned into different ‘photon-number’ 
outcomes, b) Foc.k state POVM elements after incorporat¬ 
ing the uncertainty in the probe state energies. 


The results of this inversion are shown in Fig. 2a. The 
efficacy of this model-based routine can be estimated 
by using the calculated POVM to reconstruct the orig¬ 
inal data set. The L\ difference between this recon¬ 
struction and the original data (normalised by the L\ 
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norm of the original data set) is 0.054 as compared to 
0.047 for the unphysical reconstruction given by a least- 
squares approach, showing that this model equally ef¬ 
fectively captures the detector response while being sig¬ 
nificantly more robust to noise. The model-based ap¬ 
proach also allows us to estimate the system detection 
efficiency from the tomography data, giving an efficiency 
of 0.98(+0.02/-0.08) [25]. 

As a final step, it is necessary to incorporate the uncer¬ 
tainty in the coherent-state probe energies [26] to give the 
POVM elements shown in Fig. 2b. The higher photon- 
number POVM elements are particularly sensitive to this 
uncertainty, and show correspondingly large deviations 
from their ideal values. This highlights the crucial impor¬ 
tance of an accurately calibrated probe state source for 
detector tomography. Our setup has a high calibration 
uncertainty of 8%, however, calibration uncertainties of 
less than 1% are achievable [27, 28]. Since this shortcom¬ 
ing is not intrinsic to our detector, in the following anal¬ 
ysis we will assume such a 1% calibration uncertainty, as 
this allows us to better demonstrate the information that 
our protocol can provide. 

Characterising photon-number resolution: The above 
tomography procedure gives the probability density 
p(s\n) for a specific outcome s given an n-photon in¬ 
put to the detector. However, in typical experiments, 
we are actually interested in the complementary proba¬ 
bility density p(n\s) that the input contained n photons 
given that the detector measured outcome s. Determin¬ 
ing this requires Bayes’ theorem p(n|s) = p(s\n)p(n)/p(s) 
and thus depends on our prior probability p(n) of an 71- 
photon input [29]. 

Closely linked to determining p(n\s) is the problem 
of finding a quantitative measure of the ‘photon-number 
resolution’ of the detector. Since p[n\s) only gives infor¬ 
mation on the confidence with which a specific outcome 
s can determine the photon-number input, we propose 
a measure that represents an average of this confidence, 
weighted by the probability density for s given n input 
photons, 


C n 


Mwn , p(s\n) 2 p(n) 

p{n\s)p{s\n) as = / -^-ds 

J -oo P(s) 

[°° p{s\nfp{n) 
j-ooTMmk) s ' 



Given an input of n photons, this confidence C n repre¬ 
sents the average probability ascribed to the n photon 
component of the inferred state p{s) = ^2 n p(n I s ) \ n )( n \- 
More loosely, it represents the probability that the de¬ 
tector gives the correct photon number. Additionally, 
Cn = f (n | p(s) |n)p(s|n)ds, the average squared fidelity 
between the inferred detected state and an n photon 
number state |n), weighted by the probability p(s\n). 
For the detection of a heralding state from a spontaneous 
parametric down-conversion (SPDC) source [30] , this will 


therefore also be the fidelity of the heralded state with 
|n). Note that the detector does not have information 
on the specific input photon-number n: however, a prior 
distribution must be specified. This confidence is there¬ 
fore a function of the distribution chosen. Fig 3a shows 
the confidence for different photon numbers as a function 
of the SPDC source thermal prior distribution parameter 
A 2 , where p(\n, n) | A) = (1 — A 2 )A 2n . 
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FIG. 3. a) Calculated confidence C n for different photon num¬ 
bers as a function of the thermal prior distribution parameter 
A 2 , b) Calculated confidence for our detector given a flat 
prior, as a function of photon number n (blue). Confidence 
for outcomes at the centres of the peaks in p(s\n) (dashed 
yellow). Confidence for a time-multiplexed pseudo-number¬ 
resolving detector (dashed green). 


In order to facilitate comparison between different de¬ 
tectors, it may be useful to determine this confidence 
given a flat prior for the photon number, 


C n 



P( s > n ) 2 d5 
E Mk) ■ 


This is plotted in Fig. 3b. As would be expected, our 
detector is extremely effective at resolving vacuum and 
lower photon numbers, while for higher photon num¬ 
bers, the increasing effect of the detection inefficiency 
and gradual saturation of the detector leads to a reduced 
confidence in the outcomes. As an example of the addi¬ 
tional information given by our continuous-output anal¬ 
ysis, we also plot the confidence for a post-selected case, 
in which only outcomes at the centres of the peaks in 
p{s\n) (Fig. 2) are accepted [31]. This could be employed 
to boost the fidelity of the heralded Fock states produced 
by SPDC sources. In order to demonstrate that this mea¬ 
sure is widely applicable to different PNRDs, the confi¬ 
dence for the time-multiplexed pseudo-number-resolving 
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detector with 8 time bins presented in [14] is also shown. 
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PRINCIPAL COMPONENT ANALYSIS 

The main text of the paper focuses on only the first 
principal component w\(t). Although, particularly for 
lower photon numbers, this component provides most 
of the distinguishing information available (as measured 
through the data covariance), it is interesting to note that 
higher order components can contribute additional infor¬ 
mation. In Fig. 4 we plot example probability density 
functions in the space of s^i and for different coher¬ 
ent state probes. Structure along s, 2 is visible, and could 
be incorporated into a detector tomography analysis to 
further distinguish input states. 
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FIG. 4. Example probability density functions in the space of 
the first and second principle component scores for coherent 
state probes with varying average photon numbers. 


GAUSSIAN-MIXTURE MAXIMUM 
LIKELIHOOD ESTIMATION 

We found that published methods of performing detec¬ 
tor tomography based on constrained least squares tech¬ 
niques [12] did not give satisfactory results, with the re¬ 
sulting POVM clearly showing unphysical noise features 
(Fig. 5). This may be because the reduced overlap be¬ 
tween the POVM elements for different photon numbers 
as compared to previous studies means that regularisa- 
tion techniques designed to promote this overlap [14] do 
not work as effectively. 

As introduced in the main text, we have developed a 
novel detector tomography routine that is effective for 
high quantum efficiencies. In this approach, we parame- 
terise the detector response in as a series of overlap¬ 
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FIG. 5. Constrained least squares solutions for the Fock state 
POVM elements II n , s showing the responses to vacuum and 
up to 17 photons. The solution has unphysical noise features. 


ping Gaussian distributions. Specifically, we model the 
response of our detector to a photon number n as com¬ 
posed of a sum of n + 1 Gaussians, with widths, heights, 
and positions as free variables. Our approach is read¬ 
ily extendable to higher order components (si 2 , Si 3 ...), 
however, as in the main text, we choose to focus on sn 
here. This model gives the following expression for the 
POVM coefficients for photon number n, 

^ ^ A"(s, &n,j )• (b) 

j 

where /3 nj - is a weighting factor for the Gaussian prob¬ 
ability distribution J\f(s\fi n j,a n j) in the outcome space 
s, with mean /r nj -, and standard deviation 

We imposed the constraint that n n ,j = Pn+ i,y, he. that 
the Gaussians from different photon numbers should be 
aligned. This is physically motivated by the fact that the 
detector cannot distinguish between cases where n pho¬ 
tons were input and cases where n+1 photons were input 
and one photon was lost. Removing this constraint does 
not alter the solution significantly, beyond leading to a 
slight jitter in the location of the peaks for each photon 
number. However, this jitter complicates the additional 
analysis that we carry out, particularly with regard to 
compensating for the uncertainty in the probe state en¬ 
ergies (as discussed in Section ). No constraint is placed 
on cr n j. 

Substituting equation (5) into equation (4) of the main 
text, we find that 

p{ s = ^ ' F'a,n/3n,jJV(s\l-l n j , (6) 

n,j 

where x is used a shorthand to denote the set of all the 
parameters 8 n ,j- Hn,j, a n,j> in order to make the depen¬ 
dence on the model explicit. 

This expression gives the posterior probability density 
for the TES detector producing an outcome s in our 
model, given an input coherent-state probe |a)(a|. We 
wish to maximise this posterior probability for the data 
that we measure. Typically, maximum likelihood esti¬ 
mation [23] is carried out based on a set of observed out¬ 
comes {sii}- In this case, the quantity to be maximised 
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is the log-likelihood 


£ = lo § (^n£( Sil l ai ’X)J 
= 5 Z lo s( p ( Sil l ai ’^) ■ 


( 7 ) 


However, due to the large number of data points that 
we sample, evaluation of this sum becomes impractical. 
Instead, we used our data set {sii} to estimate the out¬ 
come probability density q(s\a k ) for each \a k ){oik\- We 
can use this distribution to rewrite equation (7) as 

C = Nk 9( s l“fc) lo S (?>OI“k.x)) ds, 

k J 

where N k is the total number of samples measured at 
each value of a k ■ Since we measured the same number 
of samples per coherent state value, we will neglect this 
constant factor that has no impact on the maximum like¬ 
lihood estimation. 

The full expression for the log-likelihood therefore be¬ 
comes 

£ = ^2 / dsq(s\a k ) ■ • • 


log I ^ ' £qfc,n Pn,j j , <J n j) 1 ■ 


In order to maximise this log-likelihood, we follow the 
standard approach [23] of taking derivatives with respect 
to each parameter in the model. For example, differenti¬ 
ating with respect to gives 


DC 

dHi 


_^ r 

~ Q(9\<Xk) 7s,k,n,j&n,j(s - Hn,j) ds 

,3 i, ' 


* n ,3 k 

in which we have defined 

Fot. k ,n fin,j *A/*, 0' n j') 


'T s,k,n,j — 


En,i £ oc k ,n Pn,j V(s|/X n J, Un,j) 


Rearranging leads to the following expression for fj, n j 

Unj = vy— / 9(s|afc) 7s,fe,rt ,j s ds (8) 
k J 

where 

N n,j = T. J q(s\a k )'y s ,k,n,jds. 


Similarly we find that 
1 


&n.i — 


—f q(s\a k ) 7s,k,n,j (S - Hn,j) 2 ds (9) 
™n,j J 


and 


Pn,j = -Jr 1 , where N n = ^ N„ tj . 


( 10 ) 


Note that these expressions for the parameters are de¬ 
pendent on 7 s ,fe,n,j, and therefore do not form a closed- 
form solution. This means that the optimal solution can¬ 
not be found analytically. However, it can be shown that 
a simple routine consisting of the repeated application 
of two steps will converge to a solution [23]. In the first 
step, the current values of the parameters are used to 
calculate 7 s ,k,n,j- This is then used in the second step to 
re-estimate the optimal values of the parameters using 
equations ( 8 ), (9) & (10). 


CALIBRATED LIGHT SOURCE 


a) 

1 0-200 pW 


b) 



FIG. 6. Calibrated light source. Because the dynamic range 
of our power meter is insufficient to span the attenuation 
required to reduce the coherent-state energy to the single¬ 
photon level, we perform the calibration in two steps at the 
expense of increased error. We use the laser diode running in 
CW mode to calibrate the attenuators, since the power meter 
is most accurate in this mode, a) We first take a series of 
measurement of the output powers Pia and P\b for a range 
of input powers to the fibre beam splitter. This lets us cali¬ 
brate the output power in port IB if we know the power in 
port 1A. b) We connect a second fibre beam splitter to the 
first one before we calibrate it to make sure the effect of the 
FC/FC connection is properly accounted for. We then make 
a series of measurements of P 2 A and Pia for a range of in¬ 
put powers. Concatenating these results we now know the 
output power in port IB given the recorded power in 2A. c) 
For the detector characterisation we switch the laser to pulsed 
operation and attenuate the input light to the nanowatt level. 



We built a calibrated coherent state source based on 
a Newport 918D-IG-OD3R power meter, which provides 
a specified calibration accuracy of 2 % of absolute power 























and a linearity of better than 0.5%. This power meter was 
used to calibrate a series of fixed attenuators to reduce 
the output from a pulsed laser to the single-photon level 
with a known mean-photon number per pulse [28]. 

Our method uses a fibre beam splitter with a fixed 
fibre attenuator connected on one of the output ports, 
as shown in Fig. 6a. As long as this attenuation is well 
within the linear dynamic range of the power meter, we 
can obtain a calibration curve for the combined splitter- 
attenuator device relating the power measured at P\a to 
the power at P\b- In our case, the attenuation required 
to reach the single photon level is much greater than the 
dynamic range of the power meter. This forces us to 
use a second, calibrated splitter-attenuator device in se¬ 
ries with the first Fig. 6b. A weighted total least-squares 
algorithm [33] was used to find the total attenuation tak¬ 
ing account of the absolute power errors in both vari¬ 
ables. The total attenuation is given by the product of 
the two attenuators, but the errors in the measurements 
add linearly since they are not independent. Thus our 
final calibrated attenuation is found to be 


»7att = (2.10 ±0.16) x 1(T 6 , 


which relates the power measured at the monitor port 2 A 
to the power at port 2 B (Fig 6c). A variable attenuator 
is used to set the input power level before the calibrated 
attenuator so that we can probe our detector with a va¬ 
riety of coherent state amplitudes. We monitor the input 
power to the attenuator using port 2 A and calculate the 
average photon number per pulse in port IB which is 
coupled to the TES. The value of r] at t also includes a 
correction to account for the Fresnel reflection from the 
unterminated fibre when plugged into the monitor power 
meter, which leads us to underestimate the total power 
that will be input when this fibre is instead directly cou¬ 
pled to the fibre leading to the TES. Fibre specifications 
give this loss at about 3.3% but there is a 1% uncertainty 
in this figure [34], 

As we discuss in the Methods section of the main text, 
the POVM element coefficient 6 n {s) gives the probabil¬ 
ity density p(s|n) that we will measure outcome s given 
n input photons. This probability is actually p(s|n, ^ a tt) 
since rj att is a variable in our tomography calculations. 
Our uncertainty in ry a tt must therefore be accounted for. 
Based on our error analysis (and assuming normally dis¬ 
tributed errors), we can estimate the probability density 
p(r] att) for rj a tt- Additionally, we can calculate p{s\n, t) 
for different 7? a tt- Combining these, we can incorporate 
this statistical uncertainty into our POVM using 



p(s\n,r] a tt)p{v att )d V att • 


The results of this analysis are shown in Fig. 2b of the 
main text. 


ESTIMATING THE SYSTEM DETECTION 
EFFICIENCY 

The POVM elements that we calculate using detec¬ 
tor tomography completely characterise the detector re¬ 
sponse. Model-free detector tomography is obtained by 
treating the detector as a black box, and so in principle 
does not contain information on the system detection ef¬ 
ficiency, i.e. the loss that occurs between the input and 
the detector. 

However, the less general, but physically motivated 
model-based detector tomography approach that we have 
adopted can allow us to make an estimate of this effi¬ 
ciency. As noted above, we assume that the response 
of the detector to each photon number is composed of 
several Gaussian elements. We can make the further as¬ 
sumption that these different Gaussian elements occur 
due to the action of loss on an initial Fock state, leading 
to a statistical mixture of photon numbers at the detec¬ 
tor. Therefore the heights of these elements should follow 
a binomial distribution within each Fock state POVM el¬ 
ement. For a given system detection efficiency, it is then 
possible to calculate the expected height of these Gaus¬ 
sian elements and compare them to the actual tomogra¬ 
phy output. We used a numerical routine to find the loss 
level that minimised the L2 norm between this predicted 
output and the tomography data. 

This analysis suggests that our system detection ef¬ 
ficiency is 0.98 (±0.02/—0.08). The asymmetric uncer¬ 
tainty arises as the efficiency is upper bounded at 1.0. 
Additionally, we find a strong agreement between the 
predicted photon number distribution and the tomogra¬ 
phy data, as shown in Fig. 7, suggesting that our initial 
assumption is correct. 


IMPACT OF PHOTON-NUMBER PRIOR 
PROBABILITIES 

As we discuss in the main text, detector tomography 
gives the probability p(s\n) that a specific outcome s will 
occur given an n-photon input to the detector. How¬ 
ever, in typical experiments, we are actually interested 
in the complementary probability p(n|s) that the input 
contained n photons given that the detector measured 
outcome s (as calculated from the detector signal v(t) by 
s = f v(t)wi(t)dt). 

Calculating this requires Bayes’ theorem p(n\s) = 
p(s\n)p(n)/p(s) and thus depends on our prior proba¬ 
bility p{n) of an ?r-photon input. Here, as an exam¬ 
ple we consider two distinct priors which might arise in 
applications. First, we consider a Poisson distribution 
p(n\a) = |a| 2 "/n! which would result from a co¬ 

herent state input. We also consider a thermal distri¬ 
bution p(n\X) = (1 — A 2 )A 2n which describes a thermal 
state input and, importantly, the single-mode marginal 
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FIG. 7. Estimating the system detection efficiency. We can assume that the different Gaussian elements that make up the 
detector response to a given photon number are due to losses between the input and the detection, leading to a reduced number 
of photons actually being detected. This loss therefore leads to a binomial distribution of Gaussian response elements when 
an initial Foc.k state is input to the detector. The relative weights of each of these Gaussian elements, as inferred from our 
maximum likelihood estimation protocol, are plotted here in gold. From this data, we can carry out a numerical routine to 
determine the detection efficiency most consistent with our data. The predicted photon (and therefore Gaussian element) 
distribution resulting from this estimated detection efficiency is shown in blue for comparison. 


statistics of a spontaneous parametric down-conversion 
source. If one mode of such a source is sent to a detec¬ 
tor, p(n\s, A) represents the statistical mixture of photon 
numbers onto which the other mode is projected. Such 
information is extremely important for quantum infor¬ 
mation and metrology applications. 

Two example probability distributions p(n\s,a) and 
p(n\s, A) are plotted in Figs. 8 a & b. As can be seen, 
the two priors lead to significant differences in the distri¬ 
butions. For the thermal distribution, the thermal prior 
suppresses the overlap between the outcomes associated 
with neighbouring photon numbers. This is because, 


for small A, n + 1 input photons will occur much less 
frequently than n photons. Therefore the predominant 
overlap contribution, due to an n+ 1-photon input being 
detected in the space of outcomes most associated with 
n input photons, occurs correspondingly less frequently 
than genuine n-photon inputs. 

The Poissonian prior plotted in Fig. 8b has the op¬ 
posite effect as the thermal prior, since in this case an 
input of n + 1 photons is more probable than an input 
of n photons, and therefore the overlap is promoted. It 
should be noted that in both cases, due to the truncation 
of our detector tomography at 17 input photons, the dis- 
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tributions p(n\s) become inaccurate in regions in which 
significant contributions would be expected from photon 
numbers greater than this. In practice, this simply trans¬ 
lates to an operational requirement that detector tomog¬ 
raphy must be extended to include all photon numbers 
that are expected to contribute in any given experiment. 
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FIG. 8. Example distributions p(n|sii) from our tomography 
data, which give the probability that the input contained n 
photons given that the detector measured outcome s. The 
effect of the prior input photon number probabilities can be 
seen in the difference between a) a thermal distribution with 
A 2 = 0.1, and b) a Poisson distribution with \a\ 2 = 5. 


POST-SELECTING OUTCOMES TO IMPROVE 
CONFIDENCE 

For certain applications [35], it is important to max¬ 
imise the fidelity of the inferred detected state with a 
photon number state ( C n ). In these cases, the fidelity 
can be improved using post-selection strategies in which 
only a subset of outcomes are accepted. This is possi¬ 
ble to explore using our detector tomography data since 
our treatment has explicitly avoided any binning of out¬ 
comes. 



Sil 


FIG. 9. Post-selecting on outcomes within windows centred 
on the peak maxima can be employed to boost the confidence 
of detected photon states. 

One strategy is to only consider outcomes within win¬ 
dows centred on the peak maxima (Fig. 9). As would 
be expected, the highest confidence is obtained in the 
limit of the window width tending to zero, in which 
case the number of accepted outcomes would also tend 
to zero. This limit therefore upper bounds the perfor¬ 
mance of this strategy, and is plotted in Fig 3b of the 
main text. For our detector, the increase in confidence 
as compared to using the full space of outcomes is com¬ 
paratively modest, since the overlap between different 
photon number POVM elements is dominated by the de¬ 
tection efficiency. However, as the detection efficiency of 
detectors improves, the intrinsic overlap between neigh¬ 
bouring Gaussian peaks is expected to become increas¬ 
ingly important. In this case, this post-selection strategy 
should become more effective. 


















































































